Subjective Measures and their Role in Data Mining Process

Author

  • Ahmed Sultan Al-Hegami
Abstract

Knowledge Discovery in Databases (KDD) is the process of extracting previously unknown, hidden, and interesting patterns from large amounts of data stored in databases. Data mining is the stage of the KDD process in which a particular data mining algorithm is applied to extract interesting knowledge. One of the most important aspects of any data mining task is the evaluation of the discovered knowledge. A major issue facing the data mining community is how to use existing knowledge about the domain to evaluate the discovered patterns. For patterns to be judged interesting, the user has to be involved by providing prior knowledge about the domain. While objective measures can be quantified using statistical methods, subjective measures are determined by the user's understanding of the domain. Using objective measures of interestingness in popular data mining algorithms often leads to another data mining problem, although of reduced complexity. Reducing the volume of discovered patterns is desirable in order to improve the efficiency of the overall KDD process, and subjective measures of interestingness are required to achieve this. In this paper we study the subjective interestingness of discovered patterns and show its role in extracting novel and interesting knowledge.
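The contrast the abstract draws between objective and subjective measures can be illustrated with a small sketch. This is not the paper's method: support and confidence are the standard objective measures for association rules, while the `novelty` score, the `select_interesting` helper, the thresholds, and the toy data are all hypothetical choices made only for illustration. The subjective score here assumes the user's prior knowledge is given as a list of already-known rules.

```python
from typing import FrozenSet, List, Tuple

# A rule is (antecedent items, consequent item); this representation is an assumption.
Rule = Tuple[FrozenSet[str], str]


def support(rule: Rule, transactions: List[FrozenSet[str]]) -> float:
    """Objective measure: fraction of transactions containing every item of the rule."""
    antecedent, consequent = rule
    items = antecedent | {consequent}
    return sum(items <= t for t in transactions) / len(transactions)


def confidence(rule: Rule, transactions: List[FrozenSet[str]]) -> float:
    """Objective measure: fraction of antecedent-matching transactions that also hold the consequent."""
    antecedent, consequent = rule
    covered = [t for t in transactions if antecedent <= t]
    if not covered:
        return 0.0
    return sum(consequent in t for t in covered) / len(covered)


def novelty(rule: Rule, known_rules: List[Rule]) -> float:
    """Hypothetical subjective measure: 1 minus the highest Jaccard overlap
    between this rule's items and the items of any rule the user already knows."""
    antecedent, consequent = rule
    items = antecedent | {consequent}
    best = 0.0
    for known_antecedent, known_consequent in known_rules:
        known_items = known_antecedent | {known_consequent}
        best = max(best, len(items & known_items) / len(items | known_items))
    return 1.0 - best


def select_interesting(
    candidates: List[Rule],
    transactions: List[FrozenSet[str]],
    known_rules: List[Rule],
    min_support: float = 0.3,   # illustrative threshold
    min_novelty: float = 0.5,   # illustrative threshold
) -> List[Rule]:
    """Keep rules that pass the objective filter and are also novel to the user."""
    return [
        r for r in candidates
        if support(r, transactions) >= min_support
        and novelty(r, known_rules) >= min_novelty
    ]


# Toy data, invented for this example.
transactions = [
    frozenset({"bread", "butter", "milk"}),
    frozenset({"bread", "butter"}),
    frozenset({"bread", "milk"}),
    frozenset({"diapers", "beer", "bread"}),
    frozenset({"diapers", "beer"}),
]
known_rules = [(frozenset({"bread"}), "butter")]  # the user's prior knowledge
candidates = [
    (frozenset({"bread"}), "butter"),
    (frozenset({"diapers"}), "beer"),
]

interesting = select_interesting(candidates, transactions, known_rules)
```

Both candidate rules pass the support threshold, so the objective filter alone keeps both. The subjective filter removes the rule the user already knows, which is the kind of reduction in pattern volume the abstract describes.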

Similar resources

Actionable Rules: Issues and New Directions

Knowledge Discovery in Databases (KDD) is the process of extracting previously unknown, hidden, and interesting patterns from large amounts of data stored in databases. Data mining is a stage of the KDD process that aims at selecting and applying a particular data mining algorithm to extract interesting and useful knowledge. It is highly expected that data mining methods will find interesting...

A Geometric View of Similarity Measures in Data Mining

The main objective of data mining is to acquire information from a set of data for prospective applications using a measure. The issue of concern is that one often has to deal with large-scale data. Several dimensionality reduction techniques, such as various feature extraction methods, have been developed to resolve the issue. However, the geometric view of the applied measure, as an additional consid...

A Hybrid Approach for Quantification of Novelty in Rule Discovery

Rule discovery is an important technique for mining knowledge from large databases. Use of objective measures for discovering interesting rules leads to another data mining problem, although of reduced complexity. Data mining researchers have studied subjective measures of interestingness to reduce the volume of discovered rules and ultimately improve the overall efficiency of the KDD process. In thi...

The evaluation and assessment of deprivation level in rural areas Case: central part of Javanrood

Introduction: Poverty and deprivation are considered among the major problems for governments, and more specifically for planners, in many countries. Combating deprivation is at the center of regional planning. In fact, attaining regional balance is pursued as a major regional planning target. Achieving this goal demands identification of backward and depri...

Identification of the Patient Requirements Using Lean Six Sigma and Data Mining

Lean health care is a new management approach that puts the patient at the core of each change. Lean construction is based on visualization for understanding and prioritizing improvements. By using only visualization techniques, much important information could be missed. In order to prioritize and select improvements, it's essential to integrate new analysis tools to achieve a good unders...

Publication date: 2004